System and a method for the detection of multiple number-plates of moving cars in a series of 2-d images

ABSTRACT

A stand-alone computer-camera system capable of extracting car-plate information. This is achieved by using an on-board computer to analyze the video stream recorded by the camera sensor, and the system can be used with any type of camera sensor. The system features specific characteristics making it extremely fast and able to catch plates of cars moving at high speed. The algorithms incorporated in this system are implemented so that they can be ported to an embedded computer system, which usually has less processing power and memory than a general-purpose computer.

BACKGROUND

1. Field

An exemplary embodiment of this invention relates to the field of Automatic Number Plate Recognition (ANPR) systems. More specifically, an exemplary embodiment of the invention relates to a method and a system capable of extracting the location of the car number-plate from a series of 2-D images, using a device equipped with a camera of any kind.

2. Description of the Related Art

There are many known devices that are able to detect the location of the number plate of a car and then recognize the plate number, producing at the output an alphanumeric text corresponding to the characters of the plate number.

There are many approaches for performing car-plate detection and recognition. Most of these systems are based on a Personal Computer to carry out the required processing tasks. In such systems a video digitizer samples the camera sensor and a PC, which runs the car-plate detection and recognition software, then processes the data. However, these implementations are not easily portable, are bulky in size, require a special power supply and are difficult to install on site.

When ANPR systems are used for recognizing plates of moving cars on highway roads, another important characteristic is the recognition speed. In order to be able to catch fast-moving cars, the plate detector must be able to analyze every frame in the video sequence very fast. The detection speed depends on the algorithm and the processor speed. Today's common processors or even dedicated digital signal processor (DSP) devices are not able to deliver the required performance.

SUMMARY

An exemplary embodiment of the invention refers to a stand-alone computer-camera system capable of extracting car-plates. This is achieved by using an on-board computer to analyze the video stream recorded by the camera sensor, and the system can be used with any type of camera sensor. The system features specific characteristics making it extremely fast and able to catch plates of cars moving at high speed.

The algorithms incorporated in this system are implemented so that they can be ported to an embedded computer system, which usually has less processing power and memory than a general-purpose computer.

BRIEF DESCRIPTION OF THE DRAWINGS

The exemplary embodiments of the invention will be described in detail, with reference to the following figures, wherein:

FIG. 1 illustrates an exemplary car plate location information extraction system;

FIG. 2 illustrates an exemplary Car Plate Detection Device through which the system detects car-plates and extracts the information of the coordinates;

FIG. 3 illustrates how moving pixels are identified;

FIG. 4 illustrates how each pixel in the background model is modeled through the use of the corresponding pixels of some consecutive frames;

FIG. 5 illustrates a flow-chart showing an exemplary method for detecting a car-plate;

FIG. 6 illustrates an exemplary moving portion of the video frame;

FIG. 7 illustrates an exemplary local thresholding approach that employs threshold adaptation using feedback from the system output and more specifically from the Digit Segmentation unit;

FIG. 8 illustrates an exemplary morphological operation;

FIG. 9 illustrates an exemplary run-length encoding technique;

FIG. 10 is a flowchart illustrating an exemplary run-length encoding technique;

FIG. 11 illustrates an exemplary technique for the initial labeling and propagation of labels;

FIG. 12 illustrates an exemplary application of a labeling algorithm;

FIG. 13 is a flowchart illustrating an exemplary conflict-resolving algorithm;

FIG. 14 illustrates an exemplary region feature extraction technique;

FIG. 15 illustrates an exemplary pattern classification scheme used for region classification;

FIG. 16 illustrates how digits in a binary plate image appear as coherent regions;

FIG. 17 is a flowchart illustrating an exemplary technique for a background-foreground inversion for the regions detected as plates; and

FIG. 18 is a flowchart illustrating an exemplary technique for new inverted runs being decoded in binary image format.

DETAILED DESCRIPTION

In the current description we refer to the detection of multiple car-plates from a video sequence and the extraction of the coordinates of each plate. In accordance with an exemplary embodiment of the present invention, location information of car-plates is extracted from an image frame sequence by using a system like the one shown in FIG. 1. This system uses a camera sensor (11 in FIG. 1) which captures the video frames (12 in FIG. 1), stores the most recent frame in a memory (13 in FIG. 1) and then processes it with a car-plate detection device (14 in FIG. 1), comprising a storage section (15 in FIG. 1) and a processing section (16 in FIG. 1), in order to extract car-plates.

The Car Plate Detection Device through which the system detects car-plates and extracts the information of the coordinates is shown in FIG. 2.

This exemplary system functions as follows: First, two consecutive frames (12 in FIG. 1) are input into the Image Data Input Unit (221 in FIG. 2) from the Storage Memory (13 in FIG. 1) and are temporarily stored in the Input Image Data memory (21 in FIG. 2). The data are then fed into the Moving Object Detection unit (222 in FIG. 2), which detects the parts of the video frames corresponding to moving objects at any time and stores these parts in the Moving Object Image Data memory (22 in FIG. 2). Data from the Moving Object Image Data memory are then fed into the Automatic Threshold Adaptation unit (223 in FIG. 2), which calculates the optimal local binarization threshold parameter. This unit also takes input from the Digit Segmentation Unit (229 in FIG. 2) and from the Number of Digits Input unit (233 in FIG. 2). It then feeds the threshold parameter into the Image Binarization unit (224 in FIG. 2), which binarizes the moving object image data obtained from the Moving Object Image Data memory (22 in FIG. 2) and stores the result in the Binary Image Data memory (23 in FIG. 2). The Image Binarization unit (224 in FIG. 2) can optionally obtain the threshold from the user through the Threshold Input Unit (231 in FIG. 2), or from the Automatic Threshold Calculation unit (232 in FIG. 2).

Data from the Binary Image Data memory are then fed to the Morphological Filtering unit (225 in FIG. 2), which filters out unwanted noise and stores the filtered image data in the filtered Binary Image Data memory (24 in FIG. 2). Data from this memory are input to the Connected Component Analysis (CCA) unit (226 in FIG. 2), which analyzes the binary data to find blocks of pixels corresponding to regions (blobs) and then stores the results in the Region Data memory (25 in FIG. 2).

The next step is the classification of the blobs in order to identify the car plates. This procedure takes place in the Region Classification unit (228 in FIG. 2), which analyzes the data previously stored in the Region Data memory (25 in FIG. 2), using classification criteria defined by the user through the Classification Criteria Trimming unit (227 in FIG. 2). The output of the Region Classification unit, namely the plate coordinates, is stored in the detected plate coordinates memory (26 in FIG. 2).

The results of this extraction are then fed into the Plate Output unit (234 in FIG. 2), which outputs the plates to the system output when the Automatic Threshold Adaptation unit (223 in FIG. 2) indicates that the right number of digits has been detected.

A final step of processing concerns the segmentation of the plate digits that exist in the detected plate. This procedure takes place within the Digit Segmentation Unit (229 in FIG. 2).

The results of this extraction are then fed into the Digit Output unit (230 in FIG. 2), which outputs the digits to the system output when the Automatic Threshold Adaptation unit (223 in FIG. 2) indicates that the right number of digits has been detected.

In the following paragraphs the above-mentioned units are explained in detail.

Moving Object Detection Unit (222 in FIG. 2)

This unit detects the motion of pixels from consecutive video frames. The target is to identify one or more moving cars in a steady background as viewed by the camera. The background corresponds to the view of the camera when no car is present and nothing else moves. However, this complete absence of motion rarely occurs under real-world conditions and therefore the background is instead modeled according to a background model. The background model is an image obtained using some statistical methodology, which incorporates any minor differences that may occur due to slight variations in lighting conditions, electronic noise from the camera sensor, or minor motions inherent in the video scene (e.g. tree leaves moving in the wind).

Given the background model, any moving pixels can be identified in a video frame by subtracting the background model from this particular frame. Therefore, referring to FIG. 3, the moving pixels (32 and 34 in FIG. 3) corresponding to a moving object within this video sequence are identified by subtracting the background model image (33 in FIG. 3) from the current frame (31 in FIG. 3).

As the motion in the current frame becomes more intense, more pixels differ from the background model.

The calculation of the background model can be achieved using statistical techniques: Each pixel in the background model is modeled through the use of the corresponding pixels of some consecutive frames as shown in FIG. 4. More specifically, each pixel PBM_(k) in the background model (43 in FIG. 4) is obtained as a statistical measure of the central tendency of the pixel population constituted by the pixels P_(k1 . . . N) (41 in FIG. 4) in the consecutive video sequence frames T₁ . . . T_(N) (42 in FIG. 4) having the same coordinates as PBM_(k). Possible statistical measures of the central tendency include the mean value, the median value and the mode value. However, in order to use such a central tendency measure, a number of consecutive video frames must be stored in a buffer memory, and this constitutes a significant problem when the system is to be implemented as an embedded system. In an embedded system the memory is usually limited and therefore this type of implementation is not feasible. The calculation of the mean value is an exception to this problem, since it can be calculated as a running mean. The running mean value is calculated progressively as follows:

PBM_(k)=0.5 PBM_(k)+0.5 P_(ki), i=1 . . . N   (1a)

In an exemplary embodiment, a weighted average measure is used, described by the following relation:

PBM_(k)=α PBM_(k)+(1−α) P_(ki), i=1 . . . N   (2a)

The difference between equations (1a) and (2a) is the parameter α, which in the case of the running average takes on the value 0.5. Values of α smaller than 0.5 make the system more robust to background changes. In this case the background model changes faster, or equivalently the system has limited memory and is able to forget its history. As parameter α gets smaller, the background model changes faster.
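The update of eq. (2a) can be sketched in a few lines of plain Python. This is only an illustrative sketch: frames are assumed to be lists of rows of gray values, the function names `update_background` and `build_background` are hypothetical, and the model is started from zero as described for FIG. 5.

```python
def update_background(background, frame, alpha=0.5):
    """Apply PBM = alpha * PBM + (1 - alpha) * P to every pixel (eq. 2a)."""
    return [
        [alpha * b + (1.0 - alpha) * p for b, p in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def build_background(frames, alpha=0.5):
    """Fold a sequence of frames into one background model.

    Only the current model and the current frame are held at any time,
    so no frame buffer is needed.
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = [[0.0] * width for _ in range(height)]  # zero initialization
    for frame in frames:
        background = update_background(background, frame, alpha)
    return background
```

With `alpha=0.5` this reduces to eq. (1a); smaller values of `alpha` give more weight to the newest frame, so the model follows scene changes more quickly.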

More specifically, the procedure of detecting a car-plate is the following: As a first step, the background model BM is calculated. In the first iteration, the background model is initialized with a zero value for every pixel (52 in FIG. 5). Then, the background model is calculated (53 in FIG. 5) using eq. 2a.

The background model is then subtracted from the current frame (54 in FIG. 5). Finally, the absolute value D_(k) of the difference is checked for every pixel against a threshold TH, and the corresponding pixel is categorized as background if D_(k)<TH and as moving object if D_(k)>TH (56 and 57 in FIG. 5). The parameter TH plays the role of motion sensitivity. The larger the parameter TH, the less sensitive the system will be to small motions. This is a very useful feature since it controls the response of the system in noisy conditions where there are small motions distributed across the entire frame area, corresponding to conditions such as rain, wind etc.

As a final step, the system outputs the coordinates of the moving object using the following procedure: First, all the coordinates of the pixels characterized as "moving" are sorted (58 in FIG. 5). From this procedure the minimum and maximum coordinates in the x-direction (x_min and x_max) as well as the minimum and maximum coordinates in the y-direction (y_min and y_max) are computed. Then, rectangle Q₁Q₂Q₃Q₄ is formed (62 in FIG. 6), representing the moving portion of the video frame (61 in FIG. 6), with the corner points Q₁ to Q₄ having the following coordinates: Q1=(x_min, y_min), Q2=(x_max, y_min), Q3=(x_min, y_max), Q4=(x_max, y_max).
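The thresholding and bounding-rectangle steps can be illustrated as follows. This is a sketch under stated assumptions: images are lists of rows, the names `moving_pixels` and `bounding_rectangle` are hypothetical, a pixel with a difference exactly equal to TH is treated as background, and the extremes are found with `min`/`max` rather than by sorting.

```python
def moving_pixels(frame, background, th):
    """Return (x, y) coordinates whose absolute difference from the model exceeds th."""
    coords = []
    for y, (fr_row, bg_row) in enumerate(zip(frame, background)):
        for x, (p, b) in enumerate(zip(fr_row, bg_row)):
            if abs(p - b) > th:  # equality is treated as background (assumption)
                coords.append((x, y))
    return coords


def bounding_rectangle(coords):
    """Return the corners (Q1, Q2, Q3, Q4) enclosing all moving pixels, or None."""
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    return ((x_min, y_min), (x_max, y_min), (x_min, y_max), (x_max, y_max))
```

Raising `th` makes fewer pixels qualify as moving, which mirrors the motion-sensitivity role of TH described above.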

Image Binarization Unit (224 in FIG. 2)

The Binarization unit (224 in FIG. 2) focuses on the binarization of the input image. A binarization procedure is the formation of a new image having pixels with only two possible values. In the context of the current invention these values can be either 0 (black) or 255 (white).

The binarization procedure compares each pixel in the image with a threshold value TH_bin and then forms a new binary image having a one-to-one correspondence with the initial image, as follows: Pixels in the original image with a value greater than TH_bin correspond to pixels with value 255 in the binary image, and pixels in the original image with a value lower than TH_bin correspond to pixels with value 0 in the binary image.

However, binarization using a global threshold is not an optimal solution. A major problem with global thresholding is that changes in illumination across the scene may cause some parts to be brighter (in the light) and some parts darker (in shadow) in ways that have nothing to do with the objects in the image.

Such uneven illumination can be handled by determining thresholds locally. That is, instead of having a single global threshold, we allow the threshold itself to vary smoothly across the image.
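As a minimal sketch of this global binarization, assuming the image is a list of rows of gray values and that a pixel exactly equal to TH_bin maps to 0 (the text leaves that case open), one could write the hypothetical helper `binarize`:

```python
def binarize(image, th_bin):
    """Map pixels above th_bin to 255 (white) and all others to 0 (black)."""
    return [[255 if p > th_bin else 0 for p in row] for row in image]
```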

Local Thresholding

In the current invention, we use a local thresholding method, which uses local edge properties in a window to compute the threshold.

Automatic Threshold Adaptation Unit (223 in FIG. 2)

The selection of the threshold in the Image Binarization unit is a very critical task, since it influences the content of the binary image and ultimately the precision of the detection system. Usually the value of this threshold changes with the content of the image or with the lighting conditions. Therefore the use of a constant (global or local) threshold, although an option, is not optimal. To this end an automatic threshold adaptation unit is included in the system described in the current invention. The system is able to adapt a global or local threshold according to the results of the detection process.

In an exemplary embodiment, a local thresholding approach is used which employs threshold adaptation using feedback from the system output and more specifically from the Digit Segmentation unit (229 in FIG. 2).

More specifically the unit functions as follows: For every frame I_(K) (71 in FIG. 7) an edge-map is obtained (76 in FIG. 7).

An edge map is defined as an image containing image edges. An image edge is a point in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. The points at which image brightness changes sharply are typically organized into a set of curved line segments termed edges.

Edge detection is the process of obtaining the edge-map of an image. The detection process typically filters an image by convolving a standard matrix known as an “operator” with the image. This filtering process results in an image having increased intensity for the pixels belonging to an edge and decreased intensity for pixels not belonging to an edge. Usually, as a final step, the binary edge map is obtained by applying binarization, using thresholding, to the edge-map image. This results in an image which has white pixels at the edges and black pixels everywhere else.

In an exemplary embodiment, the binary edge map (76 in FIG. 7) of frame I_(K) (71 in FIG. 7) is obtained by first applying edge filtering using a Sobel operator (74 in FIG. 7)[1] and then binarization using thresholding (75 in FIG. 7). The threshold value for the binarization unit is obtained from the Threshold Trimming sub-system (751 in FIG. 7) described below.
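A binary edge map of this kind can be sketched as below. The kernels are the standard 3×3 Sobel pair; the rest is assumed for illustration: the gradient magnitude is approximated by |gx| + |gy|, border pixels are left black, and the function name `binary_edge_map` is hypothetical.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def binary_edge_map(image, threshold):
    """Return a 0/255 image that is white where the Sobel response exceeds threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            # |gx| + |gy| is a cheap stand-in for the gradient magnitude
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 255
    return edges
```

Integer arithmetic is used throughout, which suits processors without a floating-point unit.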

The Threshold Trimming sub-system functions as follows: A pre-determined initial threshold value THRES_1=THRES_1_(INI) is set, equal to the largest integer below the value 2^(Nb)/2, where Nb is the number of bits used to represent the pixel value (e.g. for an 8-bit representation this number equals 127). The plate detection and digit segmentation process is then run. When it has finished, the number of detected digits is fed from the Digit Segmentation unit (229 in FIG. 2) and the required number of digits that must be detected is input from the Number of Digits Input unit (233 in FIG. 2). If the number of detected digits is smaller than the required number of digits, the threshold value THRES_1 is decreased and the detection is re-initiated. If the number of detected digits is higher than the required number of digits, the threshold value THRES_1 is increased and the detection is re-initiated. This process is repeated until the number of detected digits is equal to the required number of digits.

Each threshold from the Threshold Trimming sub-system is fed to the Thresholding I sub-system (75 in FIG. 7), which binarizes the edge map by using thresholding, to obtain the binary edge map E_(K) (76 in FIG. 7).

As a next step, the input frame I_(K) and the binary edge map E_(K) are each partitioned into N_(BX)×N_(BY) blocks of w×w pixels. Then the following procedure takes place iteratively for every block I^(K) _(ij) (75 in FIG. 7) of frame I_(K):

The block I^(K) _(ij) is taken (75 in FIG. 7) and the corresponding block E^(K) _(ij) is taken from the binary image E_(K) (78 in FIG. 7). For each such pair of blocks a binarization process then takes place as follows: First the I^(K) _(ij) block (75 in FIG. 7) is multiplied (79 in FIG. 7) with the corresponding E^(K) _(ij) block (78 in FIG. 7) of the binary image E_(K). The resulting block D^(K) _(ij) (791 in FIG. 7) is a semi-binary image, containing pixels having the gray-scale value of the corresponding pixel in I^(K) _(ij) when the corresponding pixel in E^(K) _(ij) has a non-zero value (i.e. the pixel is on an edge) and zero everywhere else.
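The feedback loop of the Threshold Trimming sub-system can be sketched as follows. Several details are assumptions made only for illustration: `count_digits` is a hypothetical callback standing in for the whole plate detection and digit segmentation pass, the step size is 1, and the loop gives up when the threshold leaves the valid range or after `max_iterations` passes (the text does not specify a step size or a stopping rule).

```python
def trim_threshold(count_digits, required_digits, bits=8, step=1, max_iterations=256):
    """Adjust THRES_1 until count_digits(threshold) equals required_digits.

    Returns the accepted threshold, or None if no threshold was found.
    """
    threshold = (2 ** bits) // 2 - 1  # 127 for an 8-bit representation
    upper = 2 ** bits - 1
    for _ in range(max_iterations):
        detected = count_digits(threshold)
        if detected == required_digits:
            return threshold
        if detected < required_digits:
            threshold -= step  # too few digits: lower the threshold
        else:
            threshold += step  # too many digits: raise the threshold
        if threshold < 0 or threshold > upper:
            break  # ran out of valid threshold values
    return None
```

A caller would pass a function that reruns detection at the given threshold and reports how many digits the Digit Segmentation unit found.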

The next step is the binarization of this semi-binary block D^(K) _(ij) by applying a thresholding scheme (792 in FIG. 7), using a threshold calculated by the following formula:

THRES_2=Σ_(x) ^(W) Σ_(y) ^(W) D_(ij) ^(xy)   (1)

where D_(ij) ^(xy) is the pixel in the x-th column and the y-th row of the D^(K) _(ij) block. The result is a binary version B^(K) _(ij) (793 in FIG. 7) of the block I^(K) _(ij) of the video frame I_(K).
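The per-block procedure can be illustrated with a short sketch. It follows eq. (1) as printed, i.e. THRES_2 is the plain sum of the masked pixels. The edge block is used as a mask (non-zero means "on an edge"), pixels strictly above the threshold become white, and the name `binarize_block` is hypothetical; these choices are assumptions.

```python
def binarize_block(gray_block, edge_block):
    """Binarize one w x w block using its edge pixels to derive the threshold.

    Returns (binary_block, thres_2).
    """
    # Semi-binary block D: gray value on edge pixels, zero elsewhere
    d_block = [
        [g if e != 0 else 0 for g, e in zip(g_row, e_row)]
        for g_row, e_row in zip(gray_block, edge_block)
    ]
    thres_2 = sum(sum(row) for row in d_block)  # eq. (1) as printed
    binary = [[255 if g > thres_2 else 0 for g in g_row] for g_row in gray_block]
    return binary, thres_2
```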

Automatic Threshold Calculation Unit (232 in FIG. 2)

As an alternative to automatic threshold adaptation, an Automatic Threshold Calculation unit can be used. To this end a global threshold calculation algorithm can be used, which can lead to acceptable performance.

There are a few automatic global threshold calculation approaches that can be used in this system [2]:

Algorithm of Ridler and Calvard, which optimizes the process of changing a gray-level image to a bimodal image, while retaining the best possible illumination of the image.

Algorithm of Otsu, which is a classical algorithm in image binarization. This algorithm transforms a gray-level image to a binary image for classifying foreground and background with a global threshold. This algorithm can be applied iteratively to a gray-scale histogram of an image for generating threshold candidates.

Algorithm of Pun, which proposes an optimal criterion for image thresholding. This criterion was corrected and improved by Kapur et al., who revised Pun's algorithm by assuming two probability distributions for objects and background and maximizing the entropy of the image to obtain the optimal threshold.

Algorithm of Kittler and Illingworth, proposing a minimum error thresholding algorithm that minimizes the probability of classification error by fitting an error expression. It is assumed that a mixture of two Gaussian distributions of object and background pixels can characterize the image.

Algorithm of Fan et al., proposing a fast entropic technique to obtain a global threshold automatically by reducing complexity in computation.

Algorithm of Portes de Albuquerque et al., proposing an entropic thresholding algorithm, which is customized from the non-extensive Tsallis entropy concept.

Algorithm of Xiao et al., proposing an entropic thresholding algorithm based on the gray-level spatial correlation (GLSC) histogram. This is a revision and extension of Kapur et al.'s algorithm.

In one exemplary embodiment, the algorithm of Kapur has been selected for implementation [3]. This algorithm assumes two probability distributions, for objects p_(obj) (foreground) and background p_(bg), and maximizes the between-class entropy of the image to obtain the optimal threshold.

The between-class entropy of the threshold image is defined as:

$\begin{matrix}{{f_{1\;}({TH})} = {{H\left( {0,{TH}} \right)} + {H\left( {{TH},L} \right)}}} & (2) \\{where} & \; \\{{H\left( {0,{TH}} \right)} = {- {\sum\limits_{i = 1}^{TH}{\frac{p_{i}}{p_{obj}}\ln \frac{p_{i}}{p_{obj}}}}}} & (3) \\{{H\left( {{TH},L} \right)} = {- {\sum\limits_{i = {{TH} + 1}}^{L}{\frac{p_{i}}{p_{bg}}\ln \frac{p_{i}}{p_{bg}}}}}} & (4) \\{and} & \; \\{p_{obj} = {- {\sum\limits_{i = 0}^{TH}p_{i}}}} & (5) \\{p_{bg} = {1 - p_{obj}}} & (6)\end{matrix}$

p_(i) is the probability of a pixel value to appear in the current imageand is defined as the ratio of the appearances of a value to the totalnumber of pixels.

For bi-level thresholding, the optimal threshold is:

TH _(optimal)=ArgMax{f ₁(TH)}  (7)

In other words, the optimal threshold value is the value of TH for which the quantity f₁ is maximized for each frame.
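A direct, unoptimized sketch of this entropy-maximizing search is given below. It works on a gray-level histogram (a list of pixel counts per level) and uses the counts themselves to form the ratios p_(i)/p_(obj) and p_(i)/p_(bg). The function name `kapur_threshold`, the skipping of thresholds that leave one class empty, and the choice of the first maximum on ties are assumptions.

```python
import math


def kapur_threshold(histogram):
    """Return the TH maximizing f1(TH) = H(0, TH) + H(TH, L), or None."""
    total = sum(histogram)
    best_th, best_f = None, float("-inf")
    for th in range(len(histogram) - 1):
        obj_counts = histogram[: th + 1]
        bg_counts = histogram[th + 1:]
        n_obj = sum(obj_counts)
        n_bg = total - n_obj
        if n_obj == 0 or n_bg == 0:
            continue  # one class is empty; its entropy is undefined
        # c / n_obj equals p_i / p_obj because both share the factor 1 / total
        h_obj = -sum((c / n_obj) * math.log(c / n_obj) for c in obj_counts if c > 0)
        h_bg = -sum((c / n_bg) * math.log(c / n_bg) for c in bg_counts if c > 0)
        f1 = h_obj + h_bg
        if f1 > best_f:
            best_th, best_f = th, f1
    return best_th
```

Each candidate threshold recomputes both sums, so the cost is quadratic in the number of gray levels; for 256 levels this is still small compared with per-pixel work.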

Threshold Input Unit (231 in FIG. 2)

This unit is an input unit, which can be used optionally to input a threshold value manually.

Morphological Filtering Unit (225 in FIG. 2)

In the presence of electronic noise or physical obstacles (e.g. dust), the binarization process may result in binary noise. Binary noise manifests as white spots. These spots can cause a significant increase in processing time, because the Connected Component Analysis unit (226 in FIG. 2) separately analyzes each non-black pixel to see if it is physically connected to any other pixel.

To overcome this problem, the Morphological Filtering unit removes any isolated pixels in order to produce a more “clear” binary image.

The unit implements the following morphological operation: In each video frame (80 in FIG. 8) a 3×3 mask is formed (81 in FIG. 8) and starts rolling from the first pixel within a binary image, from position (0,0) towards higher x and y coordinates.

For each window the number of black pixels N_(b) and the number of white pixels N_(w) are counted. Then if N_(b)>N_(w) the central pixel of the 3×3 window is set to black (82 in FIG. 8); otherwise the central pixel of the 3×3 window is set to white (83 in FIG. 8).
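The majority rule above can be sketched as follows. The treatment of the image border and the use of a separate output image are not specified in the text; here border pixels are copied unchanged and every window is read from the unfiltered input. The name `majority_filter` is hypothetical.

```python
def majority_filter(image):
    """Set each interior pixel to the majority color of its 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # border pixels stay as they are (assumption)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            n_white = sum(1 for p in window if p == 255)
            n_black = len(window) - n_white
            out[y][x] = 0 if n_black > n_white else 255
    return out
```

Because a 3×3 window holds nine pixels, the two counts can never be equal, so the rule always has a strict majority to follow.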

Connected Component Analysis Unit (226 in FIG. 2)

This unit aims at the labeling of the binary image regions using a connected components algorithm. The target is to label each object within the binary image, which involves assigning a label to each pixel. Pixels that are connected are given the same label. At the end of this procedure, pixels with the same label correspond to one object, which carries the same label as its constituent (labeled) pixels.

In an exemplary embodiment, a run-length based connected component algorithm is used [4], which is similar to the two-pass connected component algorithm [5], but here run-lengths are used rather than pixels, resulting in a more efficient implementation in terms of computer memory and processing power.

The stages involved in this implementation are as follows:

-   1. Encoding pixels to runs (using run-length encoding);
-   2. Initial labeling and propagation of labels;
-   3. Resolving of conflicts; and
-   4. Translating run labels to connected component.

Encoding Pixels to Runs (Using Run-Length Encoding)

In accordance with an exemplary embodiment of the current invention, a run-length encoding representation is followed for labeling. The run-length encoded format is also much more compact than a binary image (individual runs have a single label), and so the sequential label propagation stage that follows is much faster than the conventional algorithm.

Run-length encoding works as follows: Consider the binary image frame (91 in FIG. 9). The target is to encode the contiguous foreground pixels (black colored), which, when working in rows, are simply black lines. For each line the starting pixel x-coordinate s, the end pixel x-coordinate e and the row r of the line are recorded. For example, line L₁ in FIG. 9 (92 in FIG. 9) starts at the first pixel of its row (so s=0), ends at the 5-th pixel of that row (thus e=4) and lies on the second row (thus r=1). Therefore this line is encoded as (0,4,1), and this code is also called a Run. The same procedure is followed for every line in the image. A run is complete when the end of a row is reached or when a background pixel is reached. The maximum possible number of runs in an image of size M×N is 2MN and the flow of the related algorithm is shown in FIG. 10.
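The encoding step can be sketched as below, producing one (s, e, r) triple per run. The image is assumed to be a list of rows, and the foreground value is a parameter defaulting to 0 (black), following the description above; the name `encode_runs` is hypothetical.

```python
def encode_runs(image, foreground=0):
    """Return a list of runs (s, e, r): start column, end column, row."""
    runs = []
    for r, row in enumerate(image):
        start = None
        for x, p in enumerate(row):
            if p == foreground:
                if start is None:
                    start = x  # a new run begins here
            elif start is not None:
                runs.append((start, x - 1, r))  # background pixel closes the run
                start = None
        if start is not None:
            runs.append((start, len(row) - 1, r))  # end of row closes the run
    return runs
```

For a row of five black pixels followed by white ones on the second image row, this yields the run (0, 4, 1) used as the example for line L₁.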

Initial Labeling and Propagation of Labels

This stage involves initial labeling and propagation of labels (FIG. 11). The IDs and equivalences (EQs) of all runs are initialized to zero. This is followed by a raster scan of the runs, assigning provisional labels, which propagate to any adjacent runs on the row below. For any unassigned run (ID_(i)=0) a unique value is assigned to both its ID and EQ.

After that, the 4-way or 8-way connectivity of each run is checked. In 4-way connectivity, the adjacent pixels in four directions (up, down, left, right) are checked. If they are foreground pixels then they are connected; otherwise they are unconnected. Consider for example pixels P₃ (98 in FIG. 9) and P₄ (97 in FIG. 9). These are 4-way connected since pixel P₃ is on the left of pixel P₄.

In 8-way connectivity, the diagonal directions are also checked. Consider for example pixels P₁ (95 in FIG. 9) and P₂ (96 in FIG. 9), which are not 4-way connected to each other. However, P₂ is in the diagonal direction of P₁, so P₁ and P₂ are 8-way connected.

For each Run R_(i) with identity ID_(i), excluding Runs on the last row of the image, the Runs R_(j) one row below R_(i) are checked for a connection. In terms of run-length encoded lines, 4-way connection between two Runs R_(i), R_(j) means that the following conditions hold:

s_(i)≤e_(j)   (8)

and

e_(i)≥s_(j)   (9)

8-way connection between two Runs R_(i), R_(j) means that the following conditions hold:

s_(i)≤e_(j)+1   (10)

and

e_(i)+1≥s_(j)   (11)

A connected run in the row below r_(i) is assigned the identity ID_(i) if and only if its own identity ID_(j) is unassigned. If there is a conflict (i.e. if an overlapping run already has an assigned ID_(j)), the equivalence of run i (EQ_(i)) is set to ID_(j).
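The connection tests can be written as one small predicate on two runs. The sketch reads the 8-way test as the 4-way overlap test widened by one pixel on each side, i.e. s_i ≤ e_j + 1 and e_i + 1 ≥ s_j, which is an assumption about the intended conditions. The name `runs_connected` is hypothetical, and adjacency is accepted in either vertical direction for convenience.

```python
def runs_connected(run_i, run_j, eight_way=False):
    """Return True if two runs (s, e, r) on adjacent rows touch each other."""
    s_i, e_i, r_i = run_i
    s_j, e_j, r_j = run_j
    if abs(r_i - r_j) != 1:
        return False  # only runs on neighboring rows are compared
    slack = 1 if eight_way else 0  # diagonal contact allows a one-pixel offset
    return s_i <= e_j + slack and e_i + slack >= s_j
```

For example, the runs (0, 4, 1) and (5, 7, 2) share no column, so they are not 4-way connected, but they touch diagonally and therefore count as 8-way connected.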

Resolving of Conflicts

The EQ and ID values should be equal. A difference between those two values for some run indicates the presence of a conflict, which occasionally happens when specially shaped objects are encountered. Thus a third stage must be included for resolving those conflicts. For example, this problem may occur when a ‘U’-shaped object is encountered. As shown in FIG. 12, applying the labeling algorithm to the ‘U’-shaped object (123 in FIG. 12) will generate four runs R₁, R₂, R₃, R₄, each with unassigned ID and

The solution is a conflict-resolving algorithm, which follows a serial procedure that scans all the runs sequentially, in the way shown in FIG. 13.

Translating Run Labels to Connected Component

At the end of this procedure, each run has a label, so it is straightforward to obtain the final components by simply gathering the runs that have the same label.
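For illustration, the grouping of runs into components can be sketched with a union-find (disjoint-set) structure. Note that this is a swapped-in technique, not the ID/EQ procedure of FIG. 13: it merges labels as connections are found instead of recording equivalences and resolving them in a later pass. The name `label_runs` is hypothetical, and the pairwise comparison of all runs is chosen for brevity rather than speed.

```python
def label_runs(runs, eight_way=False):
    """Group runs (s, e, r) into connected components; returns a list of run lists."""
    parent = list(range(len(runs)))

    def find(i):
        # Follow parent links to the representative, halving the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    slack = 1 if eight_way else 0
    for i, (s_i, e_i, r_i) in enumerate(runs):
        for j, (s_j, e_j, r_j) in enumerate(runs):
            # Run j must lie one row below run i and overlap it
            if r_j == r_i + 1 and s_i <= e_j + slack and e_i + slack >= s_j:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[max(root_i, root_j)] = min(root_i, root_j)

    components = {}
    for i, run in enumerate(runs):
        components.setdefault(find(i), []).append(run)
    return list(components.values())
```

A ‘U’-shaped object, whose two arms only meet in the bottom row, ends up as a single component because the bottom run merges the two provisional groups.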

Region Classification Unit (228 in FIG. 2)

The aim of this unit is to classify each region identified by the CCA unit (226 in FIG. 2) and stored in the Region Data memory (25 in FIG. 2) as a car-plate or not. To this end, several characteristic features of each region are measured. These features form a vector characterizing the region, which is then classified.

The region classification procedure includes two steps: the region feature extraction and the region classification.

Region Feature Extraction

Region feature extraction includes the measurement of several characteristic features of each region (142 in FIG. 14). The features that are measured are the following:

The width of the region: Width corresponds to the width of a rectangle surrounding the region under consideration (144 in FIG. 14). The width of the rectangle is computed as the difference of the maximum x coordinate minus the minimum x coordinate.

The area that the region occupies: This is the area occupied by a rectangle surrounding the region under consideration (144 in FIG. 14), measured in square pixels. The width of the rectangle is computed as the difference of the maximum x coordinate minus the minimum x coordinate, and the height of the rectangle is computed as the difference of the maximum y coordinate minus the minimum y coordinate. The area equals the product of the width and the height of the rectangle.

The magnitude of the region: This is the count of the non-white pixels, N_(NW), of the connected region and is measured in pixels.

The plenitude of a region: This measure indicates how full the region under consideration is. For example, a region containing gaps will have less plenitude than a region without gaps. The plenitude of a region is defined as the ratio of the magnitude to the area, both as defined above.

The aspect ratio of a rectangle surrounding the region under consideration (143 in FIG. 14): The region under test is surrounded by a rectangle. The ratio of this rectangle's height to the rectangle's width gives the aspect ratio of that region.

Number of scan-line intersection points: Several “virtual” lines of 1-pixel thickness are considered that intersect the region at different heights (144 in FIG. 14). The system records the number of pixels that each scan line meets in each track throughout the region and produces a feature vector FV_(SL) of cardinality N_(SL), where N_(SL) is equal to the number of the scan lines, containing the ID of each scan-line and the number of pixels that this line intersects. As an example, consider the scan lines indicated in FIG. 14 (144 in FIG. 14). Since the first line intersects 2 pixels, the second line 3 pixels and the third line 3 pixels, this feature vector is FV_(SL)={1,2,2,3,3,3}.
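The geometric features listed so far can be computed directly from the runs of a region, without decoding back to pixels. The sketch below is illustrative only: `region_features` is a hypothetical name, width and height follow the max-minus-min definition given above (so they are one less than the pixel counts), and plenitude is taken as magnitude divided by area so that gaps lower its value, which is an assumption about the intended ratio.

```python
def region_features(runs):
    """Compute simple shape features from a region's runs (s, e, r)."""
    x_min = min(s for s, _, _ in runs)
    x_max = max(e for _, e, _ in runs)
    y_min = min(r for _, _, r in runs)
    y_max = max(r for _, _, r in runs)
    width = x_max - x_min
    height = y_max - y_min
    area = width * height
    magnitude = sum(e - s + 1 for s, e, _ in runs)  # number of region pixels
    return {
        "width": width,
        "height": height,
        "area": area,
        "magnitude": magnitude,
        # Degenerate one-pixel-wide or one-row regions have no defined ratio
        "plenitude": magnitude / area if area else None,
        "aspect_ratio": height / width if width else None,
    }
```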

Statistical normalized central moments (Hu moments): Statistical manipulation of the pixels and their coordinates within a region results in the formation of a set of region-specific features called statistical moments [6]. Central moments are given by the following expression:

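$\begin{matrix}{\mu_{pq} = {\sum\limits_{x}{\sum\limits_{y}{\left( {x - \bar{x}} \right)^{p}\left( {y - \bar{y}} \right)^{q}}}}} & (12)\end{matrix}$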

In Eq. (12), x and y are the coordinates of each pixel in the region and x̄, ȳ are the mean values of all x and all y coordinates, respectively, over the non-white pixels within this region. The integers p and q determine the order of a statistical moment. Combinations of low-order statistical moments (e.g. μ₂₀, μ₀₂ and μ₁₁) represent physical measures of the region such as the mean, the mass-center, the skewness, the angle with the x-axis etc. For example, the angle of a region with the horizontal x-axis is given by the following expression:

$\begin{matrix}{\theta = {\frac{1}{2}{\arctan\left( \frac{2\mu_{11}}{\mu_{20} - \mu_{02}} \right)}}} & (13)\end{matrix}$
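As a hedged illustration, the angle of a region can be evaluated from its pixel coordinates as sketched below. The sketch uses the commonly used form of this relation, which includes a factor of one half, and it uses `math.atan2` rather than a plain arctangent so that the case μ₂₀ = μ₀₂ does not cause a division by zero. Both choices, and the function name `region_angle`, are assumptions of this example.

```python
import math


def region_angle(pixels):
    """Angle (radians) between the region's principal axis and the x-axis.

    pixels: list of (x, y) coordinates of the non-white pixels (assumed layout).
    """
    n = len(pixels)
    x_mean = sum(x for x, _ in pixels) / n
    y_mean = sum(y for _, y in pixels) / n
    # Second-order central moments (Eq. 12 with p + q = 2).
    mu11 = sum((x - x_mean) * (y - y_mean) for x, y in pixels)
    mu20 = sum((x - x_mean) ** 2 for x, _ in pixels)
    mu02 = sum((y - y_mean) ** 2 for _, y in pixels)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)


# Usage: a horizontal run of pixels and a diagonal run of pixels.
horizontal = [(i, 0) for i in range(5)]
diagonal = [(i, i) for i in range(5)]
angle_h = math.degrees(region_angle(horizontal))
angle_d = math.degrees(region_angle(diagonal))
```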

In an exemplary embodiment, the calculation of these statistical moments is performed in the encoded space, directly on the run-length encoded runs. As described above, each run is described by three numbers, namely s_(i), e_(i), r_(i), which indicate the start and the end in the x-direction as well as the row of each run of non-white pixels within the region under consideration. If this type of description is used, Eq. 12 cannot be applied directly, since the coordinates of each pixel in the region under consideration are not available. To this end, Eq. 12 should be modified accordingly. Below, this modification of the central moments is given for orders up to 3 (p+q≤3).

$\begin{matrix}{\mspace{79mu} {\mu_{11} = {{\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{{r_{i}\left( \frac{s_{i} + e_{i}}{2} \right)}\left( {e_{i} - s_{i} + 1} \right)}}} - \overset{\_}{y}}}} & (14) \\{\mu_{20} = {{\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{\left( \frac{e_{i} - s_{i} + 1}{6} \right)\left\lbrack {\left( {e_{i} + s_{i}} \right)^{2} + {e_{i}\left( {e_{i} + 1} \right)} + {s_{i}\left( {s_{i} - 1} \right)}} \right\rbrack}}} - {\overset{\_}{x}}^{2}}} & (15) \\{\mspace{79mu} {\mu_{02} = {{\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{r_{i}^{2}\left( {e_{i} - s_{i} + 1} \right)}}} - {\overset{\_}{y}}^{2}}}} & (16) \\{\mspace{79mu} {\mu_{12} = {{\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{{r_{i}^{2}\left( \frac{s_{i} + e_{i}}{2} \right)}\left( {e_{i} - s_{i} + 1} \right)}}} - {2\overset{\_}{y}\mu_{11}} - {\overset{\_}{x}\; \mu_{02}} + {\overset{\_}{x}{\overset{\_}{y}}^{2}}}}} & (17) \\{\mu_{21} = {{\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{{r_{i}\left( \frac{e_{i} - s_{i} + 1}{6} \right)}\left\lbrack {\left( {e_{i} + s_{i}} \right)^{2} + {e_{i}\left( {e_{i} + 1} \right)} + {s_{i}\left( {s_{i} - 1} \right)}} \right\rbrack}}} - {\overset{\_}{y}\; \mu_{20}} - {2\overset{\_}{x}\; \mu_{11}} - {{\overset{\_}{x}}^{2}\overset{\_}{y}}}} & (18) \\{\mspace{79mu} {\mu_{03} = {{\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{r_{i}^{3}\left( {e_{i} - s_{i} + 1} \right)}}} - {3\overset{\_}{y}\; \mu_{02}} - {\overset{\_}{y}}^{3}}}} & (19) \\{{\mu_{30} = {{\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{\left( \frac{e_{i} - s_{i} + 1}{4} \right)\left\lbrack {e_{i}^{3} - {2s_{i}^{3}} + {s_{i}^{2}\left( {e_{i} - 1} \right)} + {e_{i}\left( {s_{i} + 1} \right)}} \right\rbrack}}} - {3\overset{\_}{x\;}\mu_{20}} - {\overset{\_}{x}}^{3}}},} & (20) \\{\mspace{79mu} {where}} & \; \\{\mspace{79mu} {\overset{\_}{x} = {\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{{r_{i}\left( \frac{s_{i} + e_{i}}{2} \right)}\left( {e_{i} - s_{i} + 1} \right)}}}}} & (21) \\{\mspace{79mu} 
{\overset{\_}{y} = {\frac{1}{N_{NW}}{\sum\limits_{i}^{\;}{r_{i}\left( {e_{i} - s_{i} + 1} \right)}}}}} & (22)\end{matrix}$
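The following sketch illustrates how moments of this kind may be accumulated directly from the runs without decoding them. It is an assumption-laden example, not the implementation of the embodiment: the function names are hypothetical, the runs are assumed to be (s, e, r) tuples with inclusive integer ends, and the closed-form sums over one run (sum of x, of x squared and of x cubed for x from s to e) are standard identities. The returned values are averaged over the N_NW pixels.

```python
def run_sums(s, e):
    """Closed-form sums over the pixels x = s..e of one run (inclusive)."""
    n = e - s + 1                                             # pixels in the run
    sum_x = n * (s + e) // 2                                  # sum of x
    sum_x2 = n * ((e + s) ** 2 + e * (e + 1) + s * (s - 1)) // 6   # sum of x^2
    sum_x3 = n * (e + s) * (e * (e + 1) + s * (s - 1)) // 4        # sum of x^3
    return n, sum_x, sum_x2, sum_x3


def central_moments_from_runs(runs):
    """Per-pixel-averaged central moments up to order 3 from (s, e, r) runs."""
    N = Sx = Sy = Sxx = Sxy = Syy = Sxxx = Sxxy = Sxyy = Syyy = 0
    for s, e, r in runs:
        n, sx, sx2, sx3 = run_sums(s, e)
        N += n
        Sx += sx
        Sy += r * n
        Sxx += sx2
        Sxy += r * sx
        Syy += r * r * n
        Sxxx += sx3
        Sxxy += r * sx2
        Sxyy += r * r * sx
        Syyy += r ** 3 * n
    xb, yb = Sx / N, Sy / N            # mean x and mean y
    mu11 = Sxy / N - xb * yb
    mu20 = Sxx / N - xb ** 2
    mu02 = Syy / N - yb ** 2
    mu12 = Sxyy / N - 2 * yb * mu11 - xb * mu02 - xb * yb ** 2
    mu21 = Sxxy / N - yb * mu20 - 2 * xb * mu11 - xb ** 2 * yb
    mu03 = Syyy / N - 3 * yb * mu02 - yb ** 3
    mu30 = Sxxx / N - 3 * xb * mu20 - xb ** 3
    return {"mu11": mu11, "mu20": mu20, "mu02": mu02, "mu12": mu12,
            "mu21": mu21, "mu03": mu03, "mu30": mu30}


# Usage: three runs on rows 0, 1 and 2.
example_runs = [(2, 5, 0), (1, 3, 1), (4, 9, 2)]
moments = central_moments_from_runs(example_runs)
```

Working on the runs keeps the cost proportional to the number of runs rather than the number of pixels, which is the motivation given in the text for staying in the encoded space.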

One interesting modification of these moments results when the central moments are normalized using the following relation:

$\begin{matrix}{{n_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}},{{{where}\mspace{14mu} \gamma} = {\frac{p + q}{2} + 1}}} & (23)\end{matrix}$

By using these normalized central moments, a new set of statistical moments can be formed, known as the Hu moments I_(i), given by the following relations:

I₁ = n₂₀ + n₀₂   (24)

I₂ = (n₂₀ − n₀₂)² + 4n₁₁²   (25)

I₃ = (n₃₀ − 3n₁₂)² + (3n₂₁ − n₀₃)²   (26)

I₄ = (n₃₀ + n₁₂)² + (n₂₁ + n₀₃)²   (27)

I₅ = (n₃₀ − 3n₁₂)(n₃₀ + n₁₂)[(n₃₀ + n₁₂)² − 3(n₂₁ + n₀₃)²] + (3n₂₁ − n₀₃)(n₂₁ + n₀₃)[3(n₃₀ + n₁₂)² − (n₂₁ + n₀₃)²]   (28)

I₆ = (n₂₀ − n₀₂)[(n₃₀ + n₁₂)² − (n₂₁ + n₀₃)²] + 4n₁₁(n₃₀ + n₁₂)(n₂₁ + n₀₃)   (29)

I₇ = (3n₂₁ − n₀₃)(n₃₀ + n₁₂)[(n₃₀ + n₁₂)² − 3(n₂₁ + n₀₃)²] − (n₃₀ − 3n₁₂)(n₂₁ + n₀₃)[3(n₃₀ + n₁₂)² − (n₂₁ + n₀₃)²]   (30)

In a different implementation, the run-length encoded region under consideration is first decoded in order to obtain the initial binary image corresponding to this region. In this case, Eq. 12 is applied directly. The procedure that is followed in order to do this is analyzed below, in the description of the digit segmentation unit.

The feature vector FV_(HM)={I₁, I₂, I₃, I₄, I₅, I₆, I₇} resulting from this set of features contains up to 7 numbers corresponding to the 7 Hu moments I₁ to I₇ as described in Eqs. 24-30.
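For illustration, the sketch below evaluates FV_(HM) for the decoded case, where the pixel coordinates are available and Eq. 12 can be applied directly. The function name `hu_moments` and the list-of-coordinates input are assumptions of this example. The central moments are normalized as in Eq. 23, with μ₀₀ equal to the pixel count.

```python
import math


def hu_moments(pixels):
    """Return [I1, ..., I7] for a region given as (x, y) non-white pixels."""
    n = len(pixels)
    xb = sum(x for x, _ in pixels) / n
    yb = sum(y for _, y in pixels) / n

    def mu(p, q):
        # Central moment of Eq. 12.
        return sum((x - xb) ** p * (y - yb) ** q for x, y in pixels)

    mu00 = mu(0, 0)  # equals the number of pixels

    def eta(p, q):
        # Normalized central moment of Eq. 23.
        return mu(p, q) / mu00 ** ((p + q) / 2 + 1)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03          # recurring sums
    c, d = n30 - 3 * n12, 3 * n21 - n03  # recurring differences
    i1 = n20 + n02
    i2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    i3 = c ** 2 + d ** 2
    i4 = a ** 2 + b ** 2
    i5 = c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2)
    i6 = (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b
    i7 = d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2)
    return [i1, i2, i3, i4, i5, i6, i7]


# Usage: the same shape, then rotated by 90 degrees and shifted.
shape = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (0, 2), (1, 2)]
rotated = [(-y + 7, x - 3) for x, y in shape]
fv_hm = hu_moments(shape)
fv_hm_rotated = hu_moments(rotated)
```

The two vectors in the usage agree up to rounding, which reflects the translation and rotation invariance that makes these moments useful as region features.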

Region Classification

The region classification aims to classify each region under consideration as a car-plate or not, also using input from the Classification Criteria Trimming unit (227 in FIG. 2).

In implementing an exemplary embodiment, a pattern classification scheme is used for region classification. To this end, the system has been previously trained offline, using a database with regions corresponding to plates and with regions corresponding to non-plates. For each region, the features described in the previous section are evaluated and a total feature vector is formed. The feature vector is then projected into the feature space, defined as a multi-dimensional space with as many dimensions as the feature vector. In such a projection, the feature vectors corresponding to plate and non-plate regions are concentrated (clustered) in separate areas of the multi-dimensional feature space. Consider the example shown in FIG. 15 incorporating a 3-dimensional feature vector FV={FV₁, FV₂, FV₃}, which builds a 3-dimensional feature space (151 in FIG. 15). Each point in this space is defined by the three coordinates FV₁, FV₂, FV₃. The projection of the several regions on this axis-system creates two clusters, one for the regions corresponding to plates (153 in FIG. 15) and one for the regions not corresponding to plates (152 in FIG. 15).

The next step is to define the centers of the individual clusters. In accordance with one exemplary embodiment, this is achieved via the calculation of the center of mass of each cluster. The center of mass has coordinates $\overline{FV}_{C} = \{\overline{FV}_{1}, \overline{FV}_{2}, \ldots, \overline{FV}_{D}\}$, where D is the dimensionality of the feature space, and each coordinate $\overline{FV}_{k}$ is defined as:

$\begin{matrix}{{\overline{FV}}_{k} = {\frac{1}{N_{S}}{\sum\limits_{i}{FV}_{ki}}}} & (31)\end{matrix}$

where N_(S) is the number of samples (regions) participating in each cluster and FV_(ki) is the k-th feature of the i-th sample. In the 3-dimensional example referred to above, the centers of the clusters are indicated as C1 (156 in FIG. 15) and C2 (157 in FIG. 15).
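The center-of-mass calculation of Eq. 31 can be sketched as below. The function name `cluster_center` and the representation of a cluster as a list of equal-length feature vectors are assumptions of this example.

```python
def cluster_center(samples):
    """Center of mass of a cluster (Eq. 31).

    samples: non-empty list of feature vectors, all of the same length D.
    """
    n_s = len(samples)       # N_S, the number of samples in the cluster
    dim = len(samples[0])    # D, the dimensionality of the feature space
    return [sum(fv[k] for fv in samples) / n_s for k in range(dim)]


# Usage: a cluster of two 3-dimensional training samples.
center = cluster_center([[1, 2, 3], [3, 4, 5]])
```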

When a new region T is tested, its feature vector FV_(T) is obtained. This corresponds to a point in the feature space. In order to test into which cluster this test point belongs, the distance of this point from the centers of the clusters is computed using some distance measure such as the L1 distance, the L2 distance, the Mahalanobis distance etc.

In one exemplary embodiment, the L2 distance is used, which is defined as follows:

In Cartesian coordinates, if p=(p₁, p₂, . . . , p_(n)) and q=(q₁, q₂, . . . , q_(n)) are two points in Euclidean n-space, then the L2 or Euclidean distance from p to q, or from q to p, is given by the following expression:

$d(p,q) = d(q,p) = \sqrt{\sum\limits_{i = 1}^{n}\left( {q_{i} - p_{i}} \right)^{2}}$

In the 3-dimensional example of FIG. 15, the distance of the test point T (155 in FIG. 15) from the cluster-center C1 (156 in FIG. 15) is d1 (158 in FIG. 15) and from the cluster-center C2 (157 in FIG. 15) is d2 (154 in FIG. 15).

Once the distances of the test point from the centers of the clusters are computed, the decision about which cluster this point belongs to is taken according to a proximity criterion. That is, the point belongs to the nearest cluster according to the distance measure used. Once this decision has been made, the region under test has been classified as plate or non-plate.
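A minimal sketch of the proximity criterion with the L2 distance follows. The function names, the dictionary of labeled cluster centers and the example coordinates are all hypothetical; the cluster centers are assumed to have been computed beforehand from the offline training database.

```python
import math


def l2_distance(p, q):
    """Euclidean (L2) distance between two points of equal dimension."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def classify_region(fv_t, centers):
    """Return the label of the cluster center nearest to the test point.

    fv_t: feature vector of the region under test.
    centers: dict mapping a label to a cluster center (assumed precomputed).
    """
    return min(centers, key=lambda label: l2_distance(fv_t, centers[label]))


# Usage with two illustrative 3-dimensional cluster centers.
centers = {"plate": [0.0, 0.0, 0.0], "non-plate": [10.0, 10.0, 10.0]}
label = classify_region([1.0, 2.0, 1.0], centers)
```

Swapping `l2_distance` for another measure (L1, Mahalanobis) changes only the key function, which is why the text treats the distance measure as interchangeable.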

While the above description utilizes a specific classifier, it is understood that an Artificial Neural Network classifier or any other type of classifier can be used.

An alternative to pattern classification is the feature filtering implementation. In this scheme, a region can be classified as plate or non-plate according to some empirical measures corresponding to physical properties of each region, or some empirical observations.

To this end, the features of width, magnitude, aspect ratio, plenitude, scan-lines and the angle with the x-axis (Eq. 13) are used. The target is the formation of a decision vector as follows:

Each of the above-mentioned features is checked against a target value or a range of target values (the target value rule, TABLE 1), which are in turn obtained from empirical observations or from governmental standards. These rules are input from the Classification Criteria Trimming unit (227 in FIG. 2).

Conformance to the target value corresponds to a true indication and non-conformance to the target value corresponds to a false indication. To this end, a binary decision vector DV is obtained as follows:

DV={D_(width_rule), D_(magnitude_rule), D_(aspect_ratio_rule), D_(plenitude_rule), D_(scan-lines_rule), D_(angle_rule)}

TABLE 1

Feature        Target Value Rule (example)
Width          >100 AND <300
Magnitude      >1000 AND <5000
Aspect ratio   >3 AND <5
Plenitude      >0.5 AND <0.9
Scan-lines     >5 AND <12
Angle          <5

A simple approach is to classify the region as a plate if and only if the decision vector contains only logic ones, meaning that all the feature values conform to the target values.

However, in the current implementation a decision fusion rule is formed, leading to optimal results. This fusion rule is the following:

FR={[D_(width_rule) AND D_(aspect_ratio_rule) AND D_(angle_rule)] OR [D_(plenitude_rule) AND D_(scan-lines_rule)]}

If FR is TRUE then the region is classified as a plate, while if FR is FALSE the region is classified as a non-plate.
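The feature filtering scheme can be sketched as follows, using the example target values of TABLE 1. The names `RULES`, `decision_vector` and `fusion_rule` are hypothetical, and each feature (including the scan-lines feature) is assumed to have already been reduced to a single number before the rule is applied.

```python
# Example target value rules as (lower bound, upper bound), both exclusive.
# None means that the bound is not used. Values follow TABLE 1.
RULES = {
    "width": (100, 300),
    "magnitude": (1000, 5000),
    "aspect_ratio": (3, 5),
    "plenitude": (0.5, 0.9),
    "scan_lines": (5, 12),
    "angle": (None, 5),
}


def decision_vector(features, rules=RULES):
    """Check every feature against its rule; True means conformance."""
    dv = {}
    for name, (low, high) in rules.items():
        value = features[name]
        dv[name] = (low is None or value > low) and (high is None or value < high)
    return dv


def fusion_rule(dv):
    """FR = [width AND aspect ratio AND angle] OR [plenitude AND scan-lines]."""
    return ((dv["width"] and dv["aspect_ratio"] and dv["angle"])
            or (dv["plenitude"] and dv["scan_lines"]))


# Usage: a region that fails three rules but still satisfies the fusion rule.
region = {"width": 200, "magnitude": 500, "aspect_ratio": 4,
          "plenitude": 0.95, "scan_lines": 20, "angle": 2}
dv = decision_vector(region)
is_plate = fusion_rule(dv)          # decision fusion rule
is_plate_simple = all(dv.values())  # simple all-ones rule
```

Keeping the rules in a plain dictionary mirrors the role of the Classification Criteria Trimming unit: the target values can be replaced without touching the decision logic.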

The target value rules can be changed when needed (e.g. when the system needs to be trimmed for a different country) through the Classification Criteria Trimming unit (227 in FIG. 2).

Classification Criteria Trimming Unit (227 in FIG. 2)

This unit is used to input the target value rules to the region classification unit (228 in FIG. 2).

Plate Output Unit (234 in FIG. 2)

The aim of this unit is to output the coordinates of each region classified as a car plate. The unit outputs the plate if and only if the Automatic Threshold Adaptation unit (223 in FIG. 2) indicates that the right number of digits has been detected.

Digit Segmentation Unit (229 in FIG. 2)

The aim of this unit is to segment the individual digits constituting a car-plate so that they can be output from the system in binary form to an Optical Character Recognition (OCR) system.

The digits in a binary plate image appear as coherent regions (161 in FIG. 16). Therefore the unit performs a CCA analysis similar to the analysis performed in the CCA unit (226 in FIG. 2). However, in addition to the plate digits, the plate image usually contains additional regions corresponding to e.g. the plate border-line (163 in FIG. 16), separation and state signs (166 in FIG. 16), noise (162 in FIG. 16) etc. To this end, an additional filtering scheme is applied in order to filter out any regions not corresponding to digits. This filtering scheme includes the computation of a simple feature and checking this feature against a target value rule.

The CCA analysis performed in this unit follows steps 2 and 3 of the CCA analysis performed in the CCA unit, preceded by an extra step, which is the background-foreground inversion. In the first CCA analysis, the digits of the plate appear as white holes (background), since the digits are usually black. Consequently, they are not run-length encoded and thus information about them cannot be extracted. To this end, a background-foreground inversion must be carried out for the regions detected as plates, using a procedure which, for a region containing N runs, is shown in FIG. 17.
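The procedure of FIG. 17 is not reproduced in this text. Purely as an illustration of the idea, one possible way to invert background and foreground in the encoded space is sketched below: within the horizontal extent of the region, every gap between consecutive runs of a row becomes a new run. The function name `invert_runs`, the (s, e, r) tuple layout and the choice of the region's own extent as the inversion window are assumptions of this example.

```python
def invert_runs(runs):
    """Turn the gaps between runs into runs (background-foreground inversion).

    runs: non-empty list of (s, e, r) tuples with inclusive ends s..e on row r.
    Returns the inverted runs, limited to the region's own x and row extent.
    """
    x_min = min(s for s, _, _ in runs)
    x_max = max(e for _, e, _ in runs)
    by_row = {}
    for s, e, r in runs:
        by_row.setdefault(r, []).append((s, e))

    inverted = []
    for r in range(min(by_row), max(by_row) + 1):
        cursor = x_min  # first column not yet covered on this row
        for s, e in sorted(by_row.get(r, [])):
            if s > cursor:
                inverted.append((cursor, s - 1, r))  # gap before this run
            cursor = max(cursor, e + 1)
        if cursor <= x_max:
            inverted.append((cursor, x_max, r))      # gap after the last run
    return inverted


# Usage: a plate-like region whose middle row contains a hole at x = 3..5.
plate_runs = [(0, 9, 0), (0, 2, 1), (6, 9, 1), (0, 9, 2)]
hole_runs = invert_runs(plate_runs)
```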

Once the background-foreground inversion has been carried out, the new inverted runs must be decoded into binary image format (pixel coordinates and values). This process is straightforward and incorporates the use of a structured image memory, which is loaded with pixel values at the coordinates indicated by the run-length code. The process followed in the current implementation for a region containing N runs is shown in detail in FIG. 18.
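The decoding step can be illustrated by the short sketch below, which is not the procedure of FIG. 18 but a simple equivalent under stated assumptions: the image memory is modeled as a list of rows initialized to 0, each run is an (s, e, r) tuple with inclusive ends, and decoded pixels are written as 1. The function name `decode_runs` is hypothetical.

```python
def decode_runs(runs, width, height):
    """Load an image memory with the pixels indicated by the run-length code.

    runs: list of (s, e, r) tuples with inclusive ends s..e on row r.
    width, height: size of the image memory in pixels.
    """
    image = [[0] * width for _ in range(height)]  # structured image memory
    for s, e, r in runs:
        for x in range(s, e + 1):
            image[r][x] = 1                       # pixel covered by the run
    return image


# Usage: two runs decoded into a 4 x 2 binary image.
binary_image = decode_runs([(1, 2, 0), (0, 0, 1)], width=4, height=2)
```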

Digit Output Unit (230 in FIG. 2)

The aim of this unit is to output the digits to the system output when the Automatic Threshold Adaptation unit (223 in FIG. 2) indicates that the right number of digits has been detected.

The systems, methods and techniques described herein can be performed or implemented on any device that comprises at least one camera, including but not limited to standalone cameras, security cameras, smart cameras, industrial cameras, mobile phones, tablet computers, laptop computers, smart TV sets and car boxes, i.e. devices embedded or installed in an automobile that collect video and images. It will be understood and appreciated by persons skilled in the art that one or more processes, sub-processes or process steps described in embodiments of the present invention can be implemented in hardware and/or software.

While the above-described flowcharts and methods have been discussed in relation to a particular sequence of events, it should be appreciated that changes to this sequence can occur without materially affecting the operation of the invention. Additionally, the exemplary techniques illustrated herein are not limited to the specifically illustrated embodiments but can also be utilized and combined with the other exemplary embodiments, and each described feature is individually and separately claimable.

Additionally, the systems, methods and protocols of this invention can be implemented on a special purpose computer, a programmed microprocessor or microcontroller and peripheral integrated circuit element(s), an ASIC or other integrated circuit, a digital signal processor, a hard-wired electronic or logic circuit such as a discrete element circuit, a programmable logic device such as a PLD, PLA, FPGA, PAL, any comparable means, or the like. In general, any device capable of implementing (or configurable to implement) a state machine that is in turn capable of implementing (or configurable to implement) the methodology illustrated herein can be used to implement the various methods, protocols and techniques according to this invention.

Furthermore, the disclosed methods may be readily implemented in software using object or object-oriented software development environments that provide portable source code that can be used on a variety of computer or workstation platforms. Alternatively, the disclosed system may be implemented partially or fully in hardware using standard logic circuits or VLSI design. Whether software or hardware is used to implement the systems in accordance with this invention is dependent on the speed and/or efficiency requirements of the system, the particular function, and the particular software or hardware systems or microprocessor or microcomputer systems being utilized. The systems and methods illustrated herein can be readily implemented in hardware and/or software using any known or later developed systems or structures, devices and/or software by those of ordinary skill in the applicable art from the functional description provided herein and with a general basic knowledge of the video processing arts.

Moreover, the disclosed methods may be readily implemented in software that can be stored on a storage medium and executed on a programmed general-purpose computer with the cooperation of a controller and memory, a special purpose computer, a microprocessor, or the like. In these instances, the systems and methods of this invention can be implemented as a program embedded on a personal computer such as an applet, JAVA™ or CGI script, as a resource residing on a server or computer workstation, as a routine embedded in a dedicated system or system component, or the like. The system can also be implemented by physically incorporating the system and/or method into a software and/or hardware system, such as the hardware and software systems of an electronic device.

It is therefore apparent that there has been provided, in accordance with the present invention, systems and methods for the detection of multiple number-plates of moving vehicles. While this invention has been described in conjunction with a number of embodiments, it is evident that many alternatives, modifications and variations would be or are apparent to those of ordinary skill in the applicable arts. Accordingly, it is intended to embrace all such alternatives, modifications, equivalents and variations that are within the spirit and scope of this invention.

REFERENCES

(All of which are incorporated herein by reference in their entirety)

1. Sobel operator, Wikipedia, http://en.wikipedia.org/wiki/Sobel_operator

2. M. Athimethphat, “A Review on Global Binarization Algorithms for Degraded Document Images”, AU J.T. 14(3): 188-195 (January 2011).

3. J. N. Kapur et al., “A new method for gray-level picture thresholding using the entropy of the histogram”, Computer Vision, Graphics and Image Processing, 29, 273-285, 1985.

4. Kofi Appiah, Andrew Hunter, Hongying Meng, Patrick Dickinson, “Accelerated hardware object extraction and labeling: from object segmentation to connected components labeling”, Preprint submitted to Computer Vision and Image Understanding, Aug. 22, 2009.

5. N. Ma, D. G. Bailey, and C. T. Johnston, “Optimized single pass connected components analysis”, IEEE International Conference on Field-Programmable Technology, 2008.

6. R. C. Gonzalez, R. E. Woods, “Digital Image Processing”, pages 514-516, Addison-Wesley, 1993.

1-8. (canceled)
9. A system comprising: a storage system; and a car-plate detection system that identifies car plates from one or more images captured by a camera, and utilizes a scan-lines feature vector in a region of consideration in combination with an empirical rule or an automatic classification system.
10. The system of claim 9, wherein the scan-lines feature vector is computed by considering several virtual lines of 1-pixel thickness that intersect a binary form of the region of consideration at different heights; wherein said system records the number of pixels that each scan line meets in each track throughout the region; and wherein said system produces a feature vector which contains the ID of each scan-line and the number of pixels that this line intersects in the region of classification.
11. A method comprising: receiving one or more images from a camera; identifying car plates from the one or more images captured by the camera; and utilizing a scan-lines feature vector in a region of consideration in combination with an empirical rule or an automatic classification system.
12. The method of claim 11, wherein the scan-lines feature vector is computed by considering several virtual lines of 1-pixel thickness that intersect a binary form of the region of consideration at different heights; wherein said method records the number of pixels that each scan line meets in each track throughout the region; and wherein said method produces a feature vector which contains the ID of each scan-line and the number of pixels that this line intersects in the region of classification.